[Zen2] Introduce ClusterBootstrapService #35488
Conversation
Today, the bootstrapping of a Zen2 cluster is driven externally, requiring something else to wait for discovery to converge and then to inject the initial configuration. This is hard to use in some situations, such as REST tests. This change introduces the `ClusterBootstrapService` which brings the bootstrap retry logic within each node and allows it to be controlled via an (unsafe) node setting.
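To make the shape of the change concrete, here is a minimal, hypothetical sketch of in-node bootstrap retry. The names (BootstrapRetrySketch, attemptBootstrap, RETRY_DELAY_SECONDS) are illustrative only and not taken from the PR; the real logic lives in ClusterBootstrapService.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: the node keeps retrying the bootstrap attempt itself
// instead of relying on an external driver to wait for discovery to converge.
public class BootstrapRetrySketch {

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private static final long RETRY_DELAY_SECONDS = 10; // assumed retry interval

    public void startBootstrap() {
        scheduler.execute(this::attemptBootstrap);
    }

    private void attemptBootstrap() {
        try {
            if (tryInjectInitialConfiguration()) {
                scheduler.shutdown(); // bootstrapped; no further attempts needed
                return;
            }
        } catch (Exception e) {
            // e.g. discovery has not converged yet; fall through and retry
        }
        // retry indefinitely rather than give up, as discussed below
        scheduler.schedule(this::attemptBootstrap, RETRY_DELAY_SECONDS, TimeUnit.SECONDS);
    }

    // Stub standing in for the real bootstrap call; returns true once it succeeds.
    private boolean tryInjectInitialConfiguration() {
        return false;
    }
}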
Pinging @elastic/es-distributed
Left a few small comments, looks good otherwise.
@Override
public void handleException(TransportException exp) {
    if (exp.getRootCause() instanceof ElasticsearchTimeoutException) {
should we simply set the timeout to a very high value instead of adding this retry logic here?
How high is high enough? I'd prefer to retry forever rather than have to remember that this might time out and stop in a future debugging session.
We can also make the timeout optional, so that setting it to null makes it unbounded (i.e. does not schedule a timeout)
Ok, introduced nullability to the timeout in 9f7e951.
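For readers following the thread, a minimal sketch of the nullable-timeout idea, with hypothetical names (the actual change is in 9f7e951): only schedule a timeout task when a timeout was actually supplied.

import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: a null timeout means "unbounded", i.e. no
// timeout task is ever scheduled and the request may retry indefinitely.
final class TimeoutSketch {
    static void sendWithOptionalTimeout(ScheduledExecutorService scheduler,
                                        Runnable onTimeout,
                                        Long timeoutMillis) { // null = unbounded
        if (timeoutMillis != null) {
            // bounded: fail the request if no response arrives in time
            scheduler.schedule(onTimeout, timeoutMillis, TimeUnit.MILLISECONDS);
        }
        // if timeoutMillis is null, nothing is scheduled here
    }
}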
new TransportResponseHandler<BootstrapClusterResponse>() {
    @Override
    public void handleResponse(BootstrapClusterResponse response) {
        logger.debug("bootstrapped successful: received {}", response);
successfully?
Fixed the message in 12d90d2 - I meant "bootstrapping" not "bootstrapped".
for (int i = 0; i < numSharedDedicatedMasterNodes; i++) {
    final Settings.Builder settings = Settings.builder();
    settings.put(Node.NODE_MASTER_SETTING.getKey(), true);
    settings.put(Node.NODE_DATA_SETTING.getKey(), false);
    if (bootstrapNodeRequired) {
        settings.put(INITIAL_MASTER_NODE_COUNT_SETTING.getKey(), numSharedDedicatedMasterNodes);
        bootstrapNodeRequired = false;
why only run the bootstrapping on a single node?
It's sufficient and more obviously correct, but either way is fine by me. Running it on multiple nodes requires validateClusterFormed() before the next node starts up, and this isn't especially clear. Your call.
We discussed this on another channel, and decided to only do the auto-bootstrapping when autoManageMinMaster mode is active. We also decided to have multiple nodes participate in / run the bootstrapping process.
Done in 150f47b.
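For illustration, the multi-node variant amounts to dropping the bootstrapNodeRequired guard from the loop above, so every dedicated master node carries the setting. This is a hypothetical sketch, not the literal 150f47b diff:

for (int i = 0; i < numSharedDedicatedMasterNodes; i++) {
    final Settings.Builder settings = Settings.builder();
    settings.put(Node.NODE_MASTER_SETTING.getKey(), true);
    settings.put(Node.NODE_DATA_SETTING.getKey(), false);
    // every master-eligible node now participates in bootstrapping,
    // so the single-node bootstrapNodeRequired flag is no longer needed
    settings.put(INITIAL_MASTER_NODE_COUNT_SETTING.getKey(), numSharedDedicatedMasterNodes);
    // ... build and start the node with these settings ...
}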
        (int) Stream.of(settings).filter(Node.NODE_MASTER_SETTING::get).count())
    .put(nodeSettings)
    .build();
bootstrapNodeRequired = false;
same here, why only bootstrap one node?
Done in 150f47b.
LGTM
@@ -1081,6 +1081,9 @@ private synchronized void reset(boolean wipeData) throws IOException {
    final Settings.Builder settings = Settings.builder();
    settings.put(Node.NODE_MASTER_SETTING.getKey(), true);
    settings.put(Node.NODE_DATA_SETTING.getKey(), false);
    if (prevNodeCount == 0 && autoManageMinMasterNodes) {
        settings.put(INITIAL_MASTER_NODE_COUNT_SETTING.getKey(), numSharedDedicatedMasterNodes + numSharedDataNodes);
I think this needs to be

    settings.put(INITIAL_MASTER_NODE_COUNT_SETTING.getKey(), numSharedDedicatedMasterNodes +
        (numSharedDedicatedMasterNodes > 0 ? 0 : numSharedDataNodes));

because of the condition further below where we make the shared data nodes data-only nodes when there are dedicated master nodes.
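To make the off-by-count concrete: with, say, 2 shared dedicated master nodes and 3 shared data nodes, the data nodes become data-only, so the initial master node count should be 2; the expression as written would produce 5.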